shape bias
Recent work has indicated that, unlike humans, ImageNet-trained CNNs tend to classify images by texture rather than by shape. How pervasive is this bias, and where does it come from? We find that, when trained on datasets of images with conflicting shape and texture, CNNs learn to classify by shape at least as easily as by texture. What factors, then, produce the texture bias in CNNs trained on ImageNet?
Emergence of Shape Bias in Convolutional Neural Networks through Activation Sparsity
Current deep-learning models for object recognition are known to be heavily biased toward texture. In contrast, human visual systems are known to be biased toward shape and structure. What could be the design principles in human visual systems that led to this difference? How could we introduce more shape bias into the deep learning models? In this paper, we report that sparse coding, a ubiquitous principle in the brain, can in itself introduce shape bias into the network.
Spatial-frequency channels, shape bias, and adversarial robustness
What spatial frequency information do humans and neural networks use to recognize objects? In neuroscience, critical band masking is an established tool that can reveal the frequency-selective filters used for object recognition. Critical band masking measures the sensitivity of recognition performance to noise added at each spatial frequency. Existing critical band masking studies show that humans recognize periodic patterns (gratings) and letters by means of a spatial-frequency filter (or channel) that has a frequency bandwidth of one octave (doubling of frequency). Here, we introduce critical band masking as a task for network-human comparison and test 14 humans and 76 neural networks on 16-way ImageNet categorization in the presence of narrowband noise.
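The noise manipulation described above can be illustrated with a minimal sketch. It is not the authors' procedure: the function names `octave_band_noise` and `add_band_noise` are hypothetical, the band is assumed to span one octave centered geometrically on the chosen frequency (from center/√2 to center·√2, in cycles per image), and images are assumed to be square, grayscale, and scaled to [0, 1].

```python
import numpy as np


def octave_band_noise(size, center_cpi, rng, std=1.0):
    """Gaussian noise limited to a one-octave band around center_cpi.

    center_cpi is in cycles per image (an assumed unit for this sketch).
    """
    white = rng.standard_normal((size, size))

    # Radial spatial frequency of every FFT coefficient, in cycles per image.
    freqs = np.fft.fftfreq(size) * size
    radius = np.hypot(freqs[:, None], freqs[None, :])

    # One octave: the upper edge is twice the lower edge.
    low = center_cpi / np.sqrt(2.0)
    high = center_cpi * np.sqrt(2.0)
    mask = (radius >= low) & (radius < high)

    # Keep only the coefficients inside the band, then return to pixel space.
    filtered = np.fft.ifft2(np.fft.fft2(white) * mask).real

    # Rescale so the noise has the requested standard deviation.
    return filtered * (std / filtered.std())


def add_band_noise(image, center_cpi, std, rng):
    """Add octave-band noise to a square grayscale image with values in [0, 1]."""
    noise = octave_band_noise(image.shape[0], center_cpi, rng, std)
    return np.clip(image + noise, 0.0, 1.0)
```

Sweeping `center_cpi` over a range of frequencies and recording classification accuracy at each value would give a sensitivity curve of the kind that critical band masking measures, for an observer or a network alike.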